A Uyghur Morpheme Analysis Method based on Conditional Random Fields
Abstract
Morpheme analysis is very important for Uyghur language processing. Morpheme analysis of Uyghur differs considerably from that of other languages; the key issues for this task are feature selection and the design of a morpheme-annotated corpus. In this paper we propose a new statistical Uyghur morpheme analysis method based on the Conditional Random Fields (CRFs) model. Preliminary experimental results demonstrate that the proposed method is effective: the F-measure of morpheme analysis reaches 87% in the open test.
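The abstract does not say how words are encoded for the CRF or which features were selected. As a rough illustration only, one common way to cast morpheme segmentation as a sequence-labelling problem is to tag each character as B (begins a morpheme) or I (inside a morpheme) and to describe each character by a small context window. The sketch below assumes that scheme. The function names, the feature set, and the Latin-script example word `kitab` + `lar` are illustrative assumptions, not the paper's method or data.

```python
from typing import Dict, List, Tuple


def morphemes_to_tags(morphemes: List[str]) -> List[Tuple[str, str]]:
    """Label each character B (begins a morpheme) or I (inside one)."""
    pairs = []
    for morpheme in morphemes:
        for i, ch in enumerate(morpheme):
            pairs.append((ch, "B" if i == 0 else "I"))
    return pairs


def char_features(chars: List[str], i: int) -> Dict[str, str]:
    """Hypothetical context-window features for the character at index i."""
    return {
        "cur": chars[i],
        "prev": chars[i - 1] if i > 0 else "<s>",
        "next": chars[i + 1] if i + 1 < len(chars) else "</s>",
        # Distance from the word end; suffixes attach at the right edge.
        "pos_from_end": str(len(chars) - 1 - i),
    }


def tags_to_morphemes(chars: List[str], tags: List[str]) -> List[str]:
    """Rebuild the morpheme sequence from per-character B/I tags."""
    morphemes: List[str] = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not morphemes:
            morphemes.append(ch)
        else:
            morphemes[-1] += ch
    return morphemes


# Assumed example: a stem plus a plural suffix, in Latin transliteration.
pairs = morphemes_to_tags(["kitab", "lar"])
chars = [c for c, _ in pairs]
tags = [t for _, t in pairs]
features = [char_features(chars, i) for i in range(len(chars))]
```

The `features` and `tags` lists form one training sequence in the shape that typical CRF toolkits accept; at prediction time, `tags_to_morphemes` turns the predicted tags back into morphemes.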
Similar resources
Bidirectional Long Short-Term Memory Network with a Conditional Random Field Layer for Uyghur Part-Of-Speech Tagging
Uyghur is an agglutinative and a morphologically rich language; natural language processing tasks in Uyghur can be a challenge. Word morphology is important in Uyghur part-of-speech (POS) tagging. However, POS tagging performance suffers from error propagation of morphological analyzers. To address this problem, we propose a few models for POS tagging: conditional random fields (CRF), long shor...
Uyghur Short Text Classification Using Morphological Information
In this paper, we propose a novel method for improving the classification performance of short text strings using conditional random fields (CRFs) that combine morphological information. Experimental results on three datasets (Uyghur, Chinese, and English) demonstrate that our method can yield higher classification accuracy than Support Vector Machine (SVM) classifier and Maximum Entropy Model ...
Log-linear Models for Uyghur Segmentation in Spoken Language Translation
To alleviate data sparsity in spoken Uyghur machine translation, we proposed a log-linear based morphological segmentation approach. Instead of learning model only from monolingual annotated corpus, this approach optimizes Uyghur segmentation for spoken translation based on both bilingual and monolingual corpus. Our approach relies on several features such as traditional conditional random fiel...
Cross-lingual Word Segmentation and Morpheme Segmentation as Sequence Labelling
This paper presents our segmentation system developed for the MLP 2017 shared tasks on cross-lingual word segmentation and morpheme segmentation. We model both word and morpheme segmentation as character-level sequence labelling tasks. The prevalent bidirectional recurrent neural network with conditional random fields as the output interface is adapted as the baseline system, which is further i...
Conditional Random Fields for Airborne Lidar Point Cloud Classification in Urban Area
Over the past decades, urban growth has been known as a worldwide phenomenon that includes widening process and expanding pattern. While the cities are changing rapidly, their quantitative analysis as well as decision making in urban planning can benefit from two-dimensional (2D) and three-dimensional (3D) digital models. The recent developments in imaging and non-imaging sensor technologies, s...
Journal title:
- Int. J. of Asian Lang. Proc.
Volume 19, Issue
Pages -
Publication date: 2009